Structural analysis of chat messages for topic detection
Authors
Abstract
Purpose: This paper studies the characteristics of chat messages by analyzing a collection of 33,121 sample messages gathered from 1,700 conversation sessions between 72 pairs of MSN Messenger users over a four-month period from June to September 2005. The primary objective of chat message characterization is to understand the properties of chat messages so that they can be analyzed effectively, for example for message topic detection.
Methodology/Approach: Based on the study of chat message characteristics, an indicative term-based categorization approach for chat topic detection is proposed. The approach incorporates several message pre-processing techniques, such as sessionalization of chat messages and extraction of features from icon texts and URLs. Naïve Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing the topics of chat sessions.
Findings: The indicative term-based approach is superior to the traditional document frequency-based approach for feature selection in chat topic categorization.
Originality/Value: This paper studies the characteristics of chat messages and proposes an indicative term-based categorization approach for chat topic detection. The proposed approach has been incorporated into an instant message analysis system for both online and offline chat topic detection.
Preprint submitted to Online Information Review, 2 May 2006
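The abstract names sessionalization and document frequency-based feature selection but does not say how either is carried out. The following is a minimal sketch of both steps under stated assumptions: sessions are split at an idle gap (the 30-minute threshold is hypothetical), terms are whitespace-separated tokens, and the sample messages are invented for illustration. It shows only the document frequency baseline; the paper's indicative term-based selection is not specified in the abstract and is not reproduced here.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical threshold: the abstract does not state how sessions are delimited.
SESSION_GAP = timedelta(minutes=30)


def sessionalize(messages, gap=SESSION_GAP):
    """Group (timestamp, text) pairs into sessions, split at idle gaps longer than `gap`."""
    sessions = []
    last_time = None
    for timestamp, text in sorted(messages, key=lambda m: m[0]):
        if last_time is None or timestamp - last_time > gap:
            sessions.append([])  # idle gap exceeded: start a new session
        sessions[-1].append(text)
        last_time = timestamp
    return sessions


def document_frequency(sessions):
    """Count, for each term, the number of sessions in which it appears."""
    df = Counter()
    for session in sessions:
        # A set ensures each term is counted once per session.
        terms = {term for text in session for term in text.lower().split()}
        df.update(terms)
    return df


def select_by_df(sessions, min_df=2):
    """Keep terms that occur in at least `min_df` sessions (the DF baseline)."""
    df = document_frequency(sessions)
    return sorted(term for term, count in df.items() if count >= min_df)


# Invented sample messages, for illustration only.
messages = [
    (datetime(2005, 6, 1, 9, 0), "did you watch the match"),
    (datetime(2005, 6, 1, 9, 2), "yes great match"),
    (datetime(2005, 6, 1, 14, 0), "the exam is tomorrow"),
    (datetime(2005, 6, 1, 14, 5), "exam at nine"),
]
sessions = sessionalize(messages)
features = select_by_df(sessions, min_df=2)
```

On this toy input the only term that survives the document frequency cut is a function word, while topic-bearing words such as "match" and "exam" each occur in a single session and are dropped. That is one plausible reason a frequency threshold alone can be a weak selector for short chat sessions, though the abstract itself gives no explanation for its finding.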
Similar resources
Topic Modeling for Answers Detection in Online Game Chats
Helping behavior is a significant part of social learning process in online games. One type of such a behavior is answering questions in a chat. We provide a method to detect if the question asked in a chat was answered and by whom. Proposed method is based on topic modeling for chat messages and comparison of a detected topic of question with a topic of possible reply. We show its efficiency o...
Automated Chat Thread Analysis: Untangling the Web
As networked digital communications proliferate in military operational command and control (C2), chat messaging is emerging as a preferred communications method for team coordination. Chat room logs provide a potentially rich source of data for analysis in after-action reviews, affording considerable insight into the decision-making processes among the training audience. The multitasking natur...
Detection of Topic Change in IRC Chat Logs
We attack the problem of topic segmentation in the domain of Internet Relay Chat logs. In this process, we examine the previous work in text segmentation using a variety of methods. After considering the pros and cons of the methods, we employ Text Tiling, pause detection, and latent semantic analysis because they did not require the usage of large pre-tagged corpora. With these systems in plac...
Traffic Scene Analysis using Hierarchical Sparse Topical Coding
Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...
Damage detection of structures using modal strain energy with Guyan reduction method
The subject of structural health monitoring and damage identification of structures at the earliest possible stage has been a noteworthy topic for researchers in the last years. Modal strain energy (MSE) based index is one of the efficient methods which are commonly used for detecting damage in structures. It is also more effective and economical to employ some methods for reducing the degrees ...
Journal title:
- Online Information Review
Volume 30, Issue
Pages -
Publication date 2006